DistanceJoin: Pattern Match Query In a Large Graph Database

Authors

  • Lei Zou
  • Lei Chen
  • M. Tamer Özsu
Abstract

The growing popularity of graph databases has generated interesting data management problems, such as subgraph search, shortest-path query, reachability verification, and pattern match. Among these, a pattern match query is more flexible than a subgraph search and more informative than a shortest-path or reachability query. In this paper, we address pattern match problems over a large data graph G. Specifically, given a pattern graph (i.e., query Q), we want to find all matches (in G) that have connections similar to those in Q. In order to reduce the search space significantly, we first transform the vertices into points in a vector space via graph embedding techniques, converting a pattern match query into a distance-based multi-way join problem over the converted vector space. We also propose several pruning strategies and a join order selection method to process the join efficiently. Extensive experiments on both real and synthetic datasets show that our method outperforms existing ones by orders of magnitude.
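
The idea of turning a pattern edge into a distance-based join over embedded points can be illustrated with a small sketch. The abstract does not specify the embedding, so the code below swaps in a simple landmark embedding as an assumption: each vertex is mapped to its BFS distances to a few landmark vertices, and on an undirected, connected graph the L-infinity distance between two such points never exceeds the true shortest-path distance (by the triangle inequality). The distance threshold `delta`, the labels, the choice of landmarks, and all function names are hypothetical and are not taken from the paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Unweighted shortest-path distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def embed(adj, landmarks):
    """Map each vertex to its vector of distances to the landmarks.

    Assumes an undirected, connected graph, so every distance is defined.
    """
    per_landmark = [bfs_distances(adj, lm) for lm in landmarks]
    return {v: tuple(d[v] for d in per_landmark) for v in adj}


def linf(p, q):
    """L-infinity distance; a lower bound on the graph distance for this embedding."""
    return max(abs(a - b) for a, b in zip(p, q))


def distance_join(adj, labels, points, label_a, label_b, delta):
    """Pairs (u, v) with the given labels whose graph distance is at most delta.

    Pairs are first filtered in the vector space, then verified on the graph.
    """
    cand_a = [v for v in adj if labels[v] == label_a]
    cand_b = [v for v in adj if labels[v] == label_b]
    result = []
    for u in cand_a:
        true_dist = None  # computed lazily, only if some pair survives pruning
        for v in cand_b:
            if u == v:
                continue
            if linf(points[u], points[v]) > delta:
                continue  # safe to prune: the true distance is at least this large
            if true_dist is None:
                true_dist = bfs_distances(adj, u)
            if true_dist[v] <= delta:
                result.append((u, v))
    return result


# Toy data graph: a path 1-2-3-4-5-6 with repeating labels a, b, c.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
labels = {1: "a", 2: "b", 3: "c", 4: "a", 5: "b", 6: "c"}
points = embed(adj, landmarks=[1, 6])

# One pattern edge (a, b) with threshold 1.
matches = distance_join(adj, labels, points, "a", "b", delta=1)
```

A pattern with several edges would chain such joins, which is where a join order selection method becomes relevant; this sketch covers only a single edge and makes no claim about the pruning strategies proposed in the paper.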

Similar resources

Separating indexes from data: a distributed scheme for secure database outsourcing

Database outsourcing is an idea to eliminate the burden of database management from organizations. Since data is a critical asset of organizations, preserving its privacy from outside adversary and untrusted server should be warranted. In this paper, we present a distributed scheme based on storing shares of data on different servers and separating indexes from data on a distinct server. Shamir...

Graph summaries for optimizing graph pattern queries on RDF databases

The adoption of the Resource Description Framework (RDF) as a metadata and semantic data representation standard is spurring the development of high-level mechanisms for storing and querying RDF data. A common approach for managing and querying RDF data is to build on Relational/Object Relational Database systems and translate queries in an RDF query language into queries in the native language...

Graphite: A Tool for Visually Querying Large Social Networks

We present Graphite, a system that allows the user to visually construct a query pattern, finds both exact and approximate subgraphs that match the pattern, and visualizes the matches, all in an integrated interface. Graphite can find arbitrary subgraph patterns in large graphs of nodes that have attributes, such as person-to-person social networks, where a person’s occupation is an attribute. ...

Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML

As moving to big data world where data is increasing in unstructured way with high velocity, there is a need of data-store to store this bundle amount of data. Traditionally, relational databases are used which are now not compatible to handle this large amount of data, so it is needed to move on to non-relational data-stores. In the current study, we have proposed an extension of the Mongo...

A Parallel Tree Pattern Query Processing Algorithm for Graph Databases using a GPGPU

Large amounts of data are modeled and stored as graphs in order to express complex data relationships. Consequently, query processing on graph structures is becoming an important component in real-world applications. The most commonly used query format is that of tree pattern queries. We present a new parallel SIMD algorithm, GGQ (GPU Graph data base Query), for answering tree pattern queries o...

Journal:
  • PVLDB

Volume 2

Publication date: 2009